Image Binarization Based On ICA Approach for Optical Character Recognition
Abstract
Image binarization plays a vital role in text segmentation, which is used in OCR applications. Binarizing text in degraded images is challenging because the colour, size, and font of the text vary, and the results are often affected by complex backgrounds, different lighting conditions, shadows, and reflections. A robust solution to this problem can significantly enhance the accuracy of scene text recognition algorithms, enabling applications such as scene understanding, automatic localization, navigation, and image retrieval. In this paper, we propose a novel method to extract and binarize text from images that contain complex backgrounds. We use an Independent Component Analysis (ICA) based technique to map out the text region, which is inherently uniform in nature, while removing shadows, specularity, and reflections, which are treated as part of the background. The algorithm performs well on images with different kinds of degradation. We evaluate our method on the DIBCO dataset and compare it with established approaches such as binarization based on Otsu's method; the results indicate that our algorithm performs better.
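The Otsu baseline named in the abstract can be sketched as follows. This is a minimal NumPy illustration of the standard Otsu global threshold, not the authors' ICA method or their evaluation code; the function names `otsu_threshold` and `binarize` are hypothetical, and the sketch assumes an 8-bit grayscale input.

```python
import numpy as np

def otsu_threshold(gray):
    """Return the Otsu threshold t for an 8-bit grayscale image.

    Pixels with value <= t form one class, pixels > t the other.
    """
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    omega = np.cumsum(prob)                  # probability of class "<= t"
    mu = np.cumsum(prob * np.arange(256))    # cumulative mean up to t
    mu_total = mu[-1]
    denom = omega * (1.0 - omega)
    # Between-class variance for every candidate t; 0 where a class is empty.
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = np.where(denom > 0, (mu_total * omega - mu) ** 2 / denom, 0.0)
    return int(np.argmax(sigma_b))

def binarize(gray):
    """Map pixels brighter than the Otsu threshold to 1 and the rest to 0."""
    t = otsu_threshold(gray)
    return (gray > t).astype(np.uint8)

if __name__ == "__main__":
    # Synthetic example: dark "text" (50) on a bright background (200).
    img = np.full((8, 8), 200, dtype=np.uint8)
    img[2:6, 2:6] = 50
    print(otsu_threshold(img))
    print(binarize(img))
```

A single global threshold of this kind has no way to account for shadows or uneven lighting, which is the kind of degradation the abstract says the ICA-based approach is meant to handle.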
Similar resources
A Quad Tree Based Binarization Approach to Improve quality of Degraded Document Images
This paper proposes a novel binarization algorithm for converting grayscale and color images into black and white images. Binarization is one of the most important processes in research on document image processing and pattern recognition. Since quality of binary image plays a critical role in the further processing of the document, especially in the ...
Parallel Implementation of Otsu’s Binarization Approach on GPU
Fast algorithms are important for efficient image processing systems that handle large sets of calculations. To speed up processing, an algorithm can be implemented in parallel on a Graphics Processing Unit (GPU). The GPU is general-purpose computation hardware; its programmability and low cost make it productive. Binarization is widely used technique in the image analysis and recognition a...
OCR binarization and image pre-processing for searching historical documents
We consider the problem of document binarization as a pre-processing step for optical character recognition (OCR) for the purpose of keyword search of historical printed documents. A number of promising techniques from the literature for binarization, pre-filtering, and post-binarization denoising were implemented along with newly developed methods for binarization: an error diffusion binarizat...
Towards Text Recognition in Natural Scene Images
In this paper, we propose a novel methodology for text detection in natural scene images. The proposed methodology is based on an efficient binarization and enhancement technique followed by a suitable connected component analysis procedure. Image binarization successfully processes natural scene images having shadows, non-uniform illumination, low contrast and large signal-dependent noise. Conn...
Unsupervised measures for parameter selection of binarization algorithms
In this paper, we propose a mechanism for systematic comparison of the efficacy of unsupervised evaluation methods for parameter selection of binarization algorithms in optical character recognition (OCR). We also analyze these measures statistically and ascertain whether a measure is suitable or not to assess a binarization method. The comparison process is streamlined in several steps. Given ...
Publication date: 2014